PARTIAL SEQUENCING OF BOTULINUM NEUROTOXIN E

Hanspeter Michel, Paul A. Martino, Nian-Zhou Zhu, Jeff Shabanowitz and Donald F. Hunt
University  of  Virginia,  Chemistry  Department,  Charlottesville,  VA  22901 


Neurotoxins of Clostridium botulinum are scientifically interesting for two reasons. First,
they are extremely toxic. Second, they can be used as models for three important biological
phenomena: selective recognition by a target cell, transport through the plasma membrane
and toxic activity. All three activities are situated on one polypeptide of approx. 150 kDa.
Whereas the complete gene sequences of neurotoxins A and C1 were published very
recently, only parts of the neurotoxin E sequence are known. By using mass spectrometry,
supported by automated Edman degradation, we were able to deduce approx. 50 kDa of
well established sequence information. Additionally, we found approx. 30 kDa of
preliminary sequence information. These sequences should facilitate completion of the
sequence of neurotoxin E. Furthermore, they should be useful for the identification of
posttranslational modifications, which are of crucial importance for the biological activity
of the protein.
1. INTRODUCTION

Botulinum neurotoxins, produced by Clostridium botulinum, can be classified into seven
types, A, B, C1, D, E, F and G (1). Botulinum neurotoxins are synthesized as approx. 150 kDa
single chain precursors which are not or only weakly toxic. These precursors are then
posttranslationally modified into the highly toxic form (2,3). Two types of posttranslational
modification are described: 1) proteolytic cleavage (nicking) at a specially susceptible
position into a heavy (approx. 100 kDa) and a light (approx. 50 kDa) chain, which are held
together by at least one disulfide bridge (2); 2) activation by proteases (4-7). Nicking alone
has not been found to be responsible for the activation of the protein (8). This is supported
by the fact that neurotoxins B and E are not nicked but are activated by proteases (4-7).
Whereas the site of nicking is described to be at a well defined position, little is known
about the exact mechanism of the actual activation. Recently a trypsin-like protease from
Clostridium botulinum type A has been purified and characterized (9). This protease cleaves
single chain type A botulinum neurotoxin into the two chain form. Although botulinum
neurotoxin E exerts its toxicity as an intact single chain protein, it can easily be nicked by
trypsin as well as Lys-C (10).

Botulinum  neurotoxins  are  multifunctional  proteins.  Their  action  as  highly  toxic  substances 
can  be  described  in  three  different  steps.  1)  Selective  binding  to  receptors  on  the  surface 
of  the  nerve  cell  plasma  membrane.  2)  Transfer  of  the  protein  through  the  plasma 
membrane  into  the  cytoplasm.  3)  Catalytic  function  in  the  cytoplasm,  which  produces  nerve 
cell dysfunction. By analogy with other structurally related toxins, the different functions
can be attributed to different regions of the protein. For a review see (11,12). Whereas
the  light  chain  is  believed  to  contain  the  catalytic  function,  the  C-terminus  of  the  heavy 
chain  seems  to  be  responsible  for  selective  binding  and  the  N-terminus  for  internalization. 

To fully understand all aspects of the action of botulinum neurotoxins, exact knowledge of the
primary sequence, posttranslational modifications and higher order structures is
essential. Until recently only partial sequence information on botulinum neurotoxins was
available (13-16). Recently the complete sequences of botulinum neurotoxin A (17) and
botulinum neurotoxin C1 (18) have been reported. A 273 amino acid residue stretch from the
N-terminus of botulinum neurotoxin E was published together with the complete sequence
of botulinum neurotoxin A. These sequences were derived from the corresponding
gene sequences. In this report we present approx. 50 kDa of the primary sequence of the
150 kDa neurotoxin E.
2. MATERIAL AND METHODS

Botulinum  neurotoxin  E  and  a  chymotryptic  digest  were  provided  by  Dr.  James  Schmidt. 
All preparations were assayed for non-toxicity prior to shipping. Trypsin and Glu-C
protease  were  sequencing  grade  from  Boehringer  Mannheim.  CNBr  was  from  Aldrich.  All 
solvents  for  high  pressure  liquid  chromatography  were  HPLC  grade.  All  other  chemicals  and 
solvents  were  of  highest  available  purity. 

Purification  of  botulinum  neurotoxins  is  described  elsewhere  (19). 

Digestion with trypsin. Approx. 1 nmol of pyridylethylated neurotoxin E was dissolved in 1
µl formic acid. Water was added to a final volume of 100 µl and the pH adjusted to 8.3 by
adding solid Tris base. The digestion was done with 3 µg of trypsin (12 h, 37 °C). The
mixture was acidified to pH 3 with acetic acid and the generated peptides separated by
reverse phase HPLC. The sample in 100 µl was injected onto a narrow bore RP300 column
(2.1 mm x 10 cm) and eluted with a gradient of 0 % to 60 % acetonitrile containing
0.085 % TFA against 0.1 % TFA in H2O.


Cyanogen bromide cleavage. 2 nmol of pyridylethylated neurotoxin E were dissolved in 100
µl of 70 % (v/v) formic acid. The cleavage reaction was done at 37 °C for 24 hours with 1 mg
of cyanogen bromide. The mixture was then lyophilized to remove solvents and cyanogen
bromide. The sample was dissolved in 3 µl of formic acid and diluted to 100 µl with 0.1 %
TFA prior to injection onto a narrow bore BU300 (2.1 x 50 mm) reverse phase column.
Peptides were eluted with a gradient of 0 % to 60 % acetonitrile containing 0.085 % TFA
(v/v) against 0.1 % TFA (v/v) in H2O.

Digestion with Glu-C. Peptides were dissolved in 50 mM ammonium bicarbonate buffer to
a concentration of 1-2 µg/µl. Enzyme was added at 2 % (w/w) and the digestion done for
16 hours at 37 °C. Separation of peptides was done as described above.
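The three cleavage procedures above can be mimicked in silico to predict which peptides a complete digest should yield. The following is a minimal sketch under simplified textbook rules, not the procedure used in this work: trypsin is assumed to cut C-terminal to K or R unless the next residue is P, cyanogen bromide C-terminal to M, and Glu-C C-terminal to E only. The names `RULES` and `digest` are hypothetical; the test sequences are taken from table 2.

```python
# Simplified cleavage rules (illustrative assumptions, not a validated model):
# each agent maps to (residues it cuts after, residues that block the cut).
RULES = {
    "trypsin": (set("KR"), set("P")),   # no cut before proline
    "cnbr":    (set("M"),  set()),      # cyanogen bromide: after Met
    "glu-c":   (set("E"),  set()),      # cleavage after Asp is ignored here
}


def digest(sequence, agent):
    """Return the fragments of a complete digest, N- to C-terminal."""
    cleave_after, not_before = RULES[agent]
    fragments, start = [], 0
    # The last residue is skipped: there is nothing to cut off after it.
    for i in range(len(sequence) - 1):
        if sequence[i] in cleave_after and sequence[i + 1] not in not_before:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    print(digest("KLINEVKIRKLREYDKAKYNSY", "glu-c"))
    print(digest("TILYIKPGGCQEFYK", "trypsin"))
```

Real digests are rarely complete and proteases show side activities, so predicted fragments are only a starting point for matching observed masses.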

Mass spectrometry. Mass spectra were recorded on either a TSQ-70 triple quadrupole
instrument (Finnigan-MAT, San Jose, CA) or a quadrupole Fourier transform instrument
(21,23). Operation of these instruments for oligopeptide sequence analysis has been
described previously (21-24). Sample ionization and volatilization by particle bombardment
on the TSQ-70 instrument were accomplished with a cesium ion gun (Antek, Palo Alto, CA)
operated at 6 keV. For ion detection, the conversion dynode of this instrument was
operated at 15 keV. Samples for analysis on either instrument were prepared by adding 0.5
to 1 µl of 0.1 % trifluoroacetic acid solution containing 10-100 pmol of peptide(s) to 0.5 µl
of a monothioglycerol matrix on a gold-plated stainless-steel probe. Electrospray mass
spectra were recorded on the TSQ-70 instrument equipped with the newly developed
Finnigan electrospray source. The electrospray needle was operated with a voltage
differential of 3-5 kV and a sheath flow of 5 µl/min of a 3/1 mixture of methanol/0.5 %
acetic acid. Collision activated dissociation experiments were conducted at energies of
20-25 eV for doubly charged ions and 15-18 eV for triply charged ions. Argon at a pressure of
3.5 mtorr was employed as the collision gas. Micro-capillary HPLC experiments were
conducted with fused silica columns having an inside diameter of 75 microns and a length
of 75 cm. The last 10 cm of the column was filled with C-18 packing material. Peptides
were eluted with a gradient of 0-80 % acetic acid (0.5 %)/acetonitrile over a 20 min period
at a flow of 1-2 µl/min.

Peptide methyl esters. 100-400 pmol of peptide(s) were dried and the carboxyl groups esterified
with 2 M methanolic HCl. The methanolic HCl was freshly made by adding 240
µl of acetyl chloride dropwise to 13 ml of methanol. After cooling (5-10 min), 20 µl of methanolic
HCl was added to the peptide(s) and the reaction left at room temperature for 2 hours.
After removal of the solvents, the peptides were analyzed on the mass spectrometer.


Automated Edman degradation. Automated Edman degradation was performed by standard
methods on a Model 473 Protein sequencer (Applied Biosystems, Foster City, CA). Analysis
of PTH amino acids was done on line with a type 140 A HPLC system. Data recording and
analysis were done on a Macintosh IIx computer (Apple Computer, Inc., Cupertino, CA) with
the Applied Biosystems software.
3. RESULTS

Digestion of botulinum neurotoxin type E was done with different proteases and with
cyanogen bromide. One of the problems in obtaining complete digests of pyridylethylated
neurotoxin E is its relative insolubility in aqueous solvents. We chose several ways to
circumvent this problem. These include solubilization of the protein in the presence of
6 M guanidine/HCl, in the presence of SDS and CHAPS, or with concentrated formic acid.
As proteases are not normally active under these conditions, the solvents had to be adjusted
to be compatible with the corresponding protease.

Trypsin digestion. So far the most successful and best characterized method is the use of formic
acid prior to the digestion. Figure 1 shows the HPLC trace of a digest of
pyridylethylated botulinum neurotoxin E with trypsin. For this digest neurotoxin E was first
dissolved in a minimal volume of concentrated formic acid. Prior to adding the trypsin the
solution was diluted and the pH adjusted to 8.3 with Tris base. Liquid secondary ion mass
spectra were recorded on the TSQ-70 mass spectrometer for fractions 16 to 47. Table
1 lists the most prominent masses found in each individual fraction. Mass spectra of the
individual fractions are shown in appendix A. Every fraction contains between 2 and 5
peptides. Being mixtures of a limited number of peptides, these fractions are ideal samples
for sequencing with the triple quadrupole mass spectrometer.


Figure  1.  High  pressure  liquid  chromatogram  of  a  tryptic  digest  of  botulinum  neurotoxin  E. 
Separation  on  a  reverse  phase  narrow  bore  column,  RP300  (2.1  x  100  mm). 


Table 1: Mass values (M + H)+ of peptides in HPLC fractions of a tryptic digest of
neurotoxin E. Masses were recorded by liquid secondary ion mass spectroscopy on the TSQ-70.

fraction   (M + H)+

16         621     896
17         779     801     898     1129
18         895*    1131*   1227*   1376*
19         509     607     886     954     1097
20         842     886
21         886     1388*
22         739     1133    1280
23         750     1086    1117    1134*   1262    1569*   2263
24         750     1134
25         750     916     1132    1380
26         545     608     837     911     1329
27         1046    1329    2139*
28         1046    1526    1917    1978
29         784     1292    1526
30         926     1138    1292    1736    1853*
31         1342*   1865*   2223
32         1112    1342    2223
33         996
34         1376*   1504*   1694*
35         947*    1152*   1779*   1876*   2802*
36         727     853     1039    1901
37         1042    1223    1513    1555    2467
38         755     1436*   2409*   2470
39         1264*   1420*   2470*
40         1089    1266    1458
41         1365    1719    1820    2287
42         1715*   2308*
43         1157    1244
44         1157    1604    1969    2513
45         900*    2512*
46         2009    2835
47         1898*   2012*

*sequences of peptides determined (see table 2)
Generally the mass of the peptide to be sequenced is selected in the first quadrupole. In
the second quadrupole the selected peptide is fragmented by collision with
argon. The masses of the resulting fragments are analyzed in the third quadrupole. Normally
a single collision activated mass spectrum is insufficient for the complete
determination of the sequence, and additional information has to be obtained. With the
exception of the differentiation of isoleucine and leucine, which have the same molecular
mass, this additional information can normally be obtained by subjecting the peptide(s)
to selective modification prior to another mass spectral analysis. Whereas esterification in
methanolic HCl identifies carboxyl groups, acetylation is normally used
to identify free amino groups. We also used automated Edman degradation; the
combination of mass spectrometry with automated Edman degradation proved very
favourable under certain conditions. As an example the sequencing of the peptides contained
in tryptic fraction 35 is described. Figure 2 shows the mass spectrum of fraction 35,
which contains five peptides with the masses 947, 1152, 1779, 1875 and 2802. We expected
to be able to obtain collision activated spectra by liquid secondary ion mass spectrometry
of the singly charged ions of the four peptides 947, 1152, 1779 and 1875.



Figure  2.  Mass  spectrum  recorded  on  HPLC  fraction  35  of  the  tryptic  digest  of  botulinum 
neurotoxin  E. 
The collision activated mass spectra of these peptides are shown in figures 3a-3d. For the
peptide 2802 we chose electrospray ionization and recorded the spectrum of
the triply charged ion. The collision activated mass spectrum is shown in figure 4. Although
sequence information can be obtained from all these mass spectra, further information is
needed to obtain a complete sequence for all five peptides. We decided to subject the total
fraction to automated Edman degradation. The cycles of this degradation are shown in
figure 5. Note that no sequence information can be obtained from these cycles alone due to the
complexity of the fraction.
Figure 3. CAD mass spectra of tryptic peptides of botulinum neurotoxin E recorded on the
(M + H)+ ions at m/z 947 (a), 1152 (b), 1779 (c) and 1875 (d). Possible fragment masses are
indicated on the top. Underlined are fragments which are identified in the mass spectrum.
Figure 4. CAD mass spectrum of the tryptic peptide of botulinum neurotoxin E recorded
on the (M + 3H)3+ ions at m/z 934. Possible fragment masses are indicated on the top.
Underlined are fragments which are identified in the mass spectrum.
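The m/z of 934 quoted for the triply charged ion of peptide 2802 follows from the usual relation between an (M + H)+ value and the m/z of the corresponding (M + nH)n+ ion. The sketch below is an illustration under that assumption; the function name `mz_for_charge` is hypothetical and the nominal mass 2802 is taken from table 1.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz_for_charge(mh, charge):
    """Convert an (M+H)+ value into the m/z of the (M+nH)n+ ion."""
    neutral = mh - PROTON                      # mass of the uncharged peptide
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, round(mz_for_charge(2802, n), 2))
```

For charge 3 the result is about 934.7, in line with the m/z 934 given in the caption of figure 4.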

From the collision activated mass spectra we concluded that the peptides 947, 1779, 1875
and 2802 contain lysine at the C-terminus: first, in all four cases we see the
corresponding fragment y-ion (mass = 147); second, we used trypsin for the digestion. For
peptide 1152 the identity of the C-terminus is not obvious. Lysine as well as arginine can
be excluded due to the lack of the corresponding fragment y-ions (mass = 147 or 175
respectively). However, we normally observe some chymotryptic activity in the trypsin,
especially after prolonged digestion, which gives at least some clues for the
identification of the C-terminus. From the Edman cycles we found lysine in positions 7,
15, 17 and 26. The assignment of residues 17 to 26 of peptide 2802 is straightforward by
comparison of the collision activated mass spectrum (fig. 4) and the Edman cycles (fig. 5).
Note that serine in position 21 cannot be seen in the Edman degradation; however, it can
be identified as the mass difference between y6 and y5 (fig. 4). Position 17 in the
automated Edman degradation shows two amino acid residues, isoleucine and lysine. Lysine
is the C-terminus of peptide 1875. Isoleucine is in peptide 2802 (mass difference y10 - y9 =
113, fig. 4). Position 16 shows two residues as well, asparagine and histidine. Asparagine is
in peptide 1875 (mass difference y2 - y1 = 114, or b16 - b15 = 114, fig. 3d). Histidine is in
peptide 2802 (mass difference y11 - y10 = 137, fig. 4). Note also that due to the presence
of histidine y11 can also be seen as a doubly charged ion. Position 15 in the automated Edman
degradation contains three amino acid residues, phenylalanine, isoleucine and lysine. Lysine
is the C-terminus of peptide 1779. Isoleucine is in peptide 1875 (mass difference y3 - y2 =
113, or b15 - b14 = 113, fig. 3d). Phenylalanine is in peptide 2802 (mass difference y12 - y11
= 147, fig. 4). y12 can be seen as a doubly charged ion as well, again due to the presence of
histidine in position 16 of this peptide. In the same way, step by step, the amino acid
residues are assigned to the corresponding peptide. This step by step assignment can be
done starting at either end, the N-terminus or the C-terminus. Both ways should finally
end in an identical sequence assignment.
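The assignments above rest on matching the mass difference between consecutive fragment ions of one series to an amino acid residue mass. A minimal sketch of that lookup is given below, using nominal (integer) residue masses. The helper names are hypothetical, and the y-ion values in the example are nominal masses calculated for the C-terminal residues I-N-K of peptide 1875 rather than values read from figure 3d.

```python
# Nominal residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def residues_for_difference(delta, tolerance=0):
    """Amino acids whose residue mass matches a fragment mass difference."""
    return sorted(aa for aa, mass in RESIDUE_MASS.items()
                  if abs(mass - delta) <= tolerance)


def read_ladder(ion_masses):
    """Candidate residues for each step of an ascending fragment ion series."""
    return [residues_for_difference(high - low)
            for low, high in zip(ion_masses, ion_masses[1:])]


if __name__ == "__main__":
    # Calculated nominal y1, y2, y3 for a peptide ending in ...I-N-K.
    print(read_ladder([147, 261, 374]))
```

The lookup returns two candidates for 113 (I/L) and for 128 (K/Q), which is why Edman degradation or chemical modification is needed to resolve these pairs.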

Chymotryptic digest. As a second example we describe the sequencing of a peptide from
a chymotryptic digest. The chymotryptic digest of botulinum neurotoxin E was done by Dr.
James Schmidt and the fraction provided for analysis. Figure 6 shows the collision
activated mass spectra of the peptide 1330 and its methyl ester form, peptide 1372. To
interpret the spectra, fragments containing the N-terminus can be compared. The mass
difference between (M + H)+ and b11 is 131, which indicates the presence of either leucine or
isoleucine at the C-terminus. This is in agreement with the fact that chymotrypsin was used
for cleavage. The next fragment, b10, is 113 mass units lower than b11, which indicates the
presence of another leucine or isoleucine. The mass difference between b10 and b9 is 87.
The third residue from the C-terminus is therefore serine. In a similar way residues are
identified step by step. The shift of 42 mass units upon esterification indicates the presence
of three carboxyl groups, the C-terminus and two aspartic or glutamic acids. The following
sequence information can be obtained from the analysis of the two spectra:
XDGNXXDQ/KQ/KSXX, where X is either leucine or isoleucine. Note also that a
differentiation between lysine and glutamine is not possible. Acetylation of amino groups and
mass spectral analysis could give the additional information needed. We decided, however,
to subject the fraction to automated Edman degradation, mainly to also differentiate
between leucine and isoleucine, which are rather abundant in this particular peptide. From
this we found the sequence of the chymotryptic peptide 1330 to be: IDGNLIDQKSIL.
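The carboxyl group count used above is simple arithmetic: each methyl ester adds 14 mass units, so the shift between the native and the esterified (M + H)+ divided by 14 gives the number of carboxyl groups. The sketch below illustrates this with the values 1330 and 1372 from the text; the function names are hypothetical.

```python
METHYL_SHIFT = 14  # nominal mass gain per esterified carboxyl group (COOH -> COOCH3)


def carboxyl_groups(mh_native, mh_ester):
    """Number of carboxyl groups implied by the methyl ester mass shift."""
    count, remainder = divmod(mh_ester - mh_native, METHYL_SHIFT)
    if remainder:
        raise ValueError("mass shift is not a multiple of 14")
    return count


def expected_carboxyl_groups(sequence):
    """C-terminus plus one side chain carboxyl per Asp or Glu."""
    return 1 + sequence.count("D") + sequence.count("E")


if __name__ == "__main__":
    print(carboxyl_groups(1330, 1372))
    print(expected_carboxyl_groups("IDGNLIDQKSIL"))
```

Both calls give 3, so the final sequence IDGNLIDQKSIL (C-terminus plus two aspartic acids) is consistent with the observed shift of 42.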
Figure 6. CAD mass spectrum of a chymotryptic peptide of neurotoxin E recorded on the
(M + H)+ ions for the unmodified form at m/z 1330 (a) and for the methyl ester form at
m/z 1372 (b).
Cyanogen bromide cleavage and Glu-C subdigest. As a last example we present the analysis
of a fraction from a cyanogen bromide digestion. Figure 7 shows the HPLC chromatogram
of cyanogen bromide treated neurotoxin E. When we subjected the most prominent peak,
peak 7 (fig. 7), to automated Edman degradation we found two peptides in this fraction. We were
able to obtain information for 22 residues. The following residues eluted in the same
stoichiometric amount, so from this sequencing alone we were not able to deduce any
sequence information for the individual peptides.


Residue no.:  1   5   10   15   20

peptide 1:    YQALQNAVNAIKTIIENVKTYL
peptide 2:    KLINEVKIRKLREYFKAKYNSI




Figure  7.  High  pressure  liquid  chromatogram  of  a  cyanogen  bromide  cleavage  of  botulinum 
neurotoxin  E.  Separation  on  a  reverse  phase  narrow  bore  column,  BU  300  (2.1  x  50  mm). 


To obtain more information from this fraction and to assign the individual amino acid
residues from the Edman degradation, we subjected this cyanogen bromide fraction to
digestion with the protease Glu-C. The HPLC chromatogram of this digest is shown in
figure 8.
The individual fractions from this Glu-C digest were then analyzed on the triple quadrupole
mass spectrometer. CAD mass spectra of the fractions 5 (m/z = 617), 21 (m/z = 1848) and
26 (m/z = 2177) are shown in figure 9. The corresponding sequences are shown on top of
the figure with the corresponding fragment masses. All three sequences can be identified
as part of the two peptides found in the automated Edman degradation (see above).
From the peptides 2177 and 1848 we can assign the residues to peptide 1. Only one peptide,
617, can be assigned to the second peptide in that cyanogen bromide fraction. We
have not been able to identify any further peptide. However, the partial sequence of this
second peptide in cyanogen bromide fraction 7 can be constructed unambiguously.
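The reasoning used here, where each Edman cycle of the mixed fraction shows two residues and a fragment sequenced by mass spectrometry tells which residue belongs to which peptide, can be sketched as a simple subtraction. The example below is illustrative only: the function name is hypothetical, the cycle contents are the first five positions of the two peptides listed above, and the known fragment is assumed to be KLINE, the N-terminal Glu-C fragment expected from peptide 2.

```python
def subtract_known(cycles, known):
    """Remove one peptide's residues from mixed Edman cycles.

    cycles: list of strings, the residues observed in each cycle
            (a two-peptide mixture is assumed).
    known:  sequence of one peptide, aligned to cycle 1.
    Returns the residues left for the other peptide, "?" where the
    known residue was not observed in that cycle.
    """
    remaining = []
    for observed, residue in zip(cycles, known):
        if residue not in observed:
            remaining.append("?")
            continue
        rest = observed.replace(residue, "", 1)
        # An empty remainder means both peptides share this residue.
        remaining.append(rest or residue)
    return "".join(remaining)


if __name__ == "__main__":
    cycles = ["YK", "QL", "AI", "LN", "QE"]
    print(subtract_known(cycles, "KLINE"))
```

With these inputs the remainder is YQALQ, the start of peptide 1. The subtraction only works when the known fragment begins at the N-terminus of its peptide, or when its offset is known.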


Figure  8.  High  pressure  liquid  chromatogram  of  a  subdigest  with  Glu-C  protease.  Cyanogen 
bromide  fraction  7  (fig.  7)  was  subjected  to  Glu-C  (V8)  digestion  and  then  separated  on  a 
narrow  bore  reverse  phase  column,  RP  300  (2.1  x  50  mm). 
Figure 9. CAD mass spectra of Glu-C protease fragments of botulinum neurotoxin E
recorded on the (M + H)+ ions at m/z = 617 (a), m/z = 1848 (b) and m/z = 2177 (c).
Fragments are indicated on top. Underlined are fragments which are seen in the mass
spectrum.


In a similar fashion we analyzed several other fractions. Table 2 shows a summary of well
established sequence information. As this is an ongoing project, a number of fractions are
not yet sufficiently characterized. More sequence information can be obtained by further
detailed analysis of other fractions.
Table 2: Sequences of peptides of botulinum neurotoxin E.

MW(a)   Fraction no.(b)   Sequence(c)

895     Tr-18             SSSVNNMR
1227    Tr-18             QALQNQVNAIK
1376    Tr-18             IKPGGCQEFYK
1388    Tr-21             VQVSNPQLNPYK
1134    Tr-23             VSIAMNNIDR
1569    Tr-23             INSFNYNDPVDNR
2139    Tr-27             YVDTSGYDSNIDINGDVYK
1853    Tr-30             NVIGTTPQDFHPPTSLK
1342    Tr-31             IGLALNIGNEAQK
1865    Tr-31             TILYIKPGGCQEFYK
1376    Tr-34             NNNGNNIGLLGFK
1504    Tr-34             LNLTIQNDAYIPK
1694    Tr-34             THLFPLYADTATTNK
947     Tr-35             YFNIFDK
1152    Tr-35             LSNLLNDSIY
1779    Tr-35             EQMYQALQNQVNAIK
1876    Tr-35             LAFNYGNANGISDYINK
2802    Tr-35             ANPYLGNDNTPDNQFHIGDASAVEIK
1436    Tr-38             LYSFTEFDXATK
2409    Tr-38             VSLNHNEIXWTLQDNAGINQK
1264    Tr-39             WIFVTITNDR
1420    Tr-39             FLTESSISYLMK
2470    Tr-39             VPEGENNVNLTSSIDTALLEQPK
1715    Tr-42             INNNLSGGILLEELSK
2308    Tr-42             VIIMGAEPDLFETNSSNISLR
900     Tr-45             NFSISFW
2512    Tr-45             LSNLLNDSIYNISEGYNINNLK
1898    Tr-47             SILNLGNIHVSNNINFK
2012    Tr-47             EYYLLNVLKPNDFINR
596     Ch-31/15          TIKSF
586     Ch-31/16          MPSNH
749     Ch-45/5           GAEPDLF
1598    Ch-45/6           NYNDPVNDRTILY
964     Ch-47/5           KAJNIEEF
1059    Ch-50/2           ENDLOVIL
1330    Ch-49/7           IDGNLIDQKSIL
        CB-21             YQALQNQVNAIKTIIENVKTYLLNYLLOHGSILGESE
        CB-21             KLINEVKIRKLREYDKAKYNSY
        CB-24             .NIWIIPER
2890    CB/V8-5           LSKANPYLGNDNTPDNQFHIGDASAVE
        CB/V8-8           RNVIGTTPQDFHPPTSLK.GDTSY

(a) molecular weight of the (M + H)+ as determined with the TSQ-70 mass spectrometer.
(b) for digestion we used trypsin (Tr), chymotrypsin (Ch), Staph. aureus V8 protease Glu-C (V8), and
cyanogen bromide (CB).
(c) sequences listed in one letter code. X = I or L; a period marks a residue that is not known.
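A sequence assignment can be cross-checked by calculating the (M + H)+ it implies and comparing it with the measured value. The sketch below does this with standard average residue masses. It is an illustration, not the calculation used by the authors: the function name is hypothetical, unmodified residues are assumed, so peptides containing pyridylethylated cysteine would need an extra mass increment, and the example peptides are taken from the table above.

```python
# Standard average residue masses (Da) of the 20 common amino acids.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153   # added once for the free N- and C-termini
PROTON = 1.00728  # charge carrier of the (M+H)+ ion


def average_mh(sequence):
    """Average-mass (M+H)+ of an unmodified peptide in one-letter code."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER + PROTON


if __name__ == "__main__":
    for peptide in ("YFNIFDK", "SSSVNNMR"):
        print(peptide, round(average_mh(peptide), 2))
```

For these two short peptides the calculated values round to the tabulated 947 and 895. For larger peptides, average and monoisotopic values diverge by a mass unit or more, so the comparison needs a correspondingly wider tolerance.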


4.  DISCUSSION 

Only recently the complete sequences of botulinum neurotoxin A (17) and neurotoxin C1
(18) were presented. A part of the sequence of botulinum neurotoxin E, including the
N-terminus, was published together with the sequence of neurotoxin A. Further sequence
information on neurotoxin E can be expected soon. The homology alignment of
botulinum neurotoxins A and C1 as well as of tetanus toxin is shown in figure 10. The
alignment was done with the program CLUSTAL in PCGene (IntelliGenetics Inc., Geneva,
Switzerland), which uses the method of Higgins and Sharp (25). In addition, the comparison
between botulinum toxin A and tetanus toxin has already been described (17). We
compared our sequences of botulinum neurotoxin E (table 2) with these three proteins by
using the programs SCANSIM and QGSEARCH in PCGene. The region of the highest
homology is indicated in figure 10. For most of the peptides (table 2) we found sufficient
homology to determine the relative position of the peptide. As can be seen, the approx. 40
% of the total possible sequence is distributed very well over the whole range of the
protein. This observation is important insofar as it argues against major parts of the protein
escaping digestion and therefore being inaccessible to sequencing. As the tryptic digest
(figure 1) is not completely analyzed with regard to sequences, further work has to be done
to determine how much of the total sequence can be obtained by analysis of one single
digest.

As the number of published gene sequences increases, the importance of sequence
analysis at the level of the protein or the corresponding peptides shifts more towards the
analysis of posttranslational modifications. The search for such modifications, however, requires
knowledge of the complete sequence. Neurotoxin E is only partially sequenced to date.
Therefore further work is necessary to completely sequence this protein; this can be
achieved by sequencing the gene or by continuing the sequencing on the protein level. Once
the complete sequence is available, a very detailed analysis of our data with
regard to posttranslational modifications will be greatly facilitated. Posttranslational modifications
are extremely important for the activation of this group of toxins, as already mentioned
in the introduction. Up to now very little is actually known about the exact mechanism of
activation, that is, the conversion of the inactive precursor protein to the actual toxic
component.


Figure 10. Homology comparison of sequences of botulinum neurotoxins A and C1 and
tetanus toxin, and comparison with sequences obtained in our laboratory (table 2). For
sequences see references (13 - 18).
Appendix B

List of additional preliminary sequence information, obtained from various digests. X = Ile or Leu; Z =
Gln or Lys; Tr = trypsin; Ch = chymotrypsin; An = protease Asp-N.


m/z    protease   sequence           m/z    protease   sequence

407    Tr         FXK                471    Tr         SXPR
545    Tr         VDAXVK             630    Tr         ZXNZK
547    Tr         TSNXX              622    Tr         YXGXR
644    Tr         GXXTXK             659    Tr         XNVSVK
681    Tr         NYGSXK             690    Tr         TXXESK
701    Tr         NZ(NX)GR           739    Tr         SFNXMK
740    Tr         SFNXMK             750    Tr         FDNXXK
743    Tr         (XN)XEVK           763    Tr         SMXANAR
780    Tr         (-)YXVK            817    Tr         WEEXXK
849    Tr         FXZXVTK            884    Tr         XVGZPTNR
887    Tr         STXXXANR           897    Tr         XXQPXTGR
907    Tr         XYSGXQVK           935    Tr         (OX)SEVMTK
986    Tr         (AR)VSVANXR        1041   Tr         NXWXXPER
1134   Tr         XKSSSVXNMR         1160   Tr         (QA)WTESXDR
1185   Tr         WDSDXSXXPK         1196   Tr         DXDTXYETAR
1202   Tr         YGXPVXADXNK        1271   Tr         (DZ)XXXNHGFSK
1416   Tr         (-TTX)SMVPZKR      1526   Tr         ZNZVYXYVVASK
779    Tr         XNFZEK             999    Tr         XXXSYFN/DK
688    Ch         SNXZNX             768    Ch         F/MRHYM
801    Ch         (DZ)AXEXX          812    Ch         NHEXNW
829    Ch         XNEVZNX            896    Ch         XXZPXTGR
1102   Ch         DXZZXENEX          1282   Ch         (PE)XVNZPVZAAX
962    Ch         XZNVTZXF
732    Tr/An      YGXPVXA            819    Tr/An      DPXFXSK
945    Tr/An      DTGVXSXXK          961    Tr/An      HTHSFVYA
985    Tr/An      DNNTAXXPK          1023   Tr/An      DNVNXVPNK
1471   Tr/An      DXZZXEXEXNZK

